Abstract

This paper outlines a weight-based method for segregating silk moths by sex at the cocoon stage.
A mathematical model of the sensor is also developed in the context of silk moth sex
identification.
1. Introduction

Silk moth sex identification and its automation are important processes in the sericulture
industry. Sex identification allows male and female moths to be separated in the early stages of
the silkworm seed production process, which avoids unregulated mating, and automation improves
precision and supports mass production. Conventionally, gender detection and separation are done
either manually with weighing machines, based on the average weight of the cocoon during the pupa
stage, or by experts who handpick matured moths based on size and physical structure before they
can mate. These conventional methods are prone to higher error rates, time consumption, and labor,
and to lower production rates and egg quality. In the proposed system, a high-precision weighing
sensor with a microcontroller measures the weights of the individual training samples, and the
K-Means linear regression model fixes the threshold for these naturally varying physical weights.
The test samples are then segregated into male and female cocoons based on the threshold.
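The thresholding step described above can be illustrated with a minimal sketch. It assumes the threshold is the midpoint between the mean weights of the two training classes and that female cocoons are the heavier class. The function names and the weights are hypothetical and are not taken from the proposed system.

```python
from statistics import mean


def fit_threshold(male_weights, female_weights):
    """Return the midpoint between the two class means (an assumed rule)."""
    return (mean(male_weights) + mean(female_weights)) / 2.0


def classify(weight, threshold):
    """Label a cocoon by comparing its weight with the threshold."""
    # Assumption: female cocoons form the heavier class.
    return "female" if weight > threshold else "male"


# Hypothetical training weights in grams (illustrative values only).
male_train = [1.0, 1.1, 1.2]
female_train = [1.5, 1.6, 1.7]

threshold = fit_threshold(male_train, female_train)
labels = [classify(w, threshold) for w in [1.05, 1.65]]
```

With these illustrative values the threshold falls midway between the two class means, and each test weight is labelled by the side of the threshold on which it lies.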

The linear regression model identifies the threshold from the set of physical weights of the
individual training samples, and the system can segregate 700 to 800 samples per hour. This
increases the rate and accuracy of the segregation process by approximately 95% compared with
traditional segregation techniques.
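The stated throughput can be converted into a time budget per cocoon with simple arithmetic. The sketch below only restates the 700 to 800 samples per hour figure; the function name is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def seconds_per_sample(samples_per_hour):
    """Average time available to weigh and sort one cocoon."""
    return SECONDS_PER_HOUR / samples_per_hour


slowest = seconds_per_sample(700)  # lower end of the stated throughput
fastest = seconds_per_sample(800)  # upper end of the stated throughput
```

At this rate, each cocoon has roughly 4.5 to 5.1 seconds for weighing, classification, and mechanical sorting combined.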

The proposed sorting method can be used to segregate any lightweight objects, and its accuracy
can be improved using artificial intelligence and machine learning algorithms.
2. Methodology

Fig. 1: Flow chart of the project

Fig. 2: Overall block diagram


4. K-Means Clustering 

k-means clustering is a method of vector quantization, originally from signal processing, that is
popular for cluster analysis in data mining. It aims to partition n observations into k clusters
in which each observation belongs to the cluster with the nearest mean, which serves as a
prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. The
problem is computationally difficult (NP-hard); however, efficient heuristic algorithms are
commonly employed and converge quickly to a local optimum. These are usually similar to the
expectation-maximization algorithm for mixtures of Gaussian distributions, since both k-means and
Gaussian mixture modeling use an iterative refinement approach. Both also use cluster centers to
model the data; however, k-means clustering tends to find clusters of comparable spatial extent,
while the expectation-maximization mechanism allows clusters to have different shapes.
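The iterative refinement heuristic mentioned above is commonly known as Lloyd's algorithm. A minimal sketch for scalar data such as cocoon weights follows. The deterministic initialisation, the function name, and the sample weights are assumptions made for illustration, and the function expects k of at least 2.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Lloyd's algorithm on scalar data; returns (centroids, labels)."""
    # Deterministic initialisation (an assumption): spread the starting
    # centroids evenly between the smallest and largest value.
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest mean.
        labels = [
            min(range(k), key=lambda j: (x - centroids[j]) ** 2)
            for x in values
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [x for x, lab in zip(values, labels) if lab == j]
            if members:
                new_centroids.append(sum(members) / len(members))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged to a local optimum
        centroids = new_centroids
    return centroids, labels


# Hypothetical cocoon weights in grams (illustrative values only).
weights = [1.0, 1.1, 1.2, 1.5, 1.6, 1.7]
centroids, labels = kmeans_1d(weights, k=2)
```

For two clusters of scalar weights, the midpoint between the two returned centroids can serve as a decision threshold, because a weight is nearer to the lower centroid exactly when it lies below that midpoint.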

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine
learning technique for classification that is often confused with k-means because of the k in the
name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to
classify new data into the existing clusters. This is known as the nearest centroid classifier or
Rocchio algorithm.

Given a set of observations (x1, x2, ..., xn), where each observation is a d-dimensional real
vector, k-means clustering aims to partition the n observations into k (≤ n) sets
S = {S1, S2, ..., Sk} so as to minimize the within-cluster sum of squares (WCSS), i.e. the
variance. Formally, the objective is to find:

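$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2 = \underset{S}{\arg\min} \sum_{i=1}^{k} |S_i| \operatorname{Var} S_i$$

where $\mu_i$ is the mean of the points in $S_i$. This is equivalent to minimizing the pairwise
squared deviations of points in the same cluster:

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \frac{1}{2|S_i|} \sum_{x, y \in S_i} \lVert x - y \rVert^2$$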

The equivalence can be deduced from the identity

$$\sum_{x \in S_i} \lVert x - \mu_i \rVert^2 = \sum_{x \neq y \in S_i} (x - \mu_i)^{T} (\mu_i - y)$$
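The two forms of the objective and the identity can be checked numerically on a small example. The sketch below uses scalar data and a hypothetical partition. The function names are illustrative, and the sum over distinct points is taken over ordered pairs of distinct indices.

```python
from itertools import product


def wcss_centroid(clusters):
    """Sum over clusters of squared deviations from the cluster mean."""
    total = 0.0
    for pts in clusters:
        mu = sum(pts) / len(pts)
        total += sum((x - mu) ** 2 for x in pts)
    return total


def wcss_pairwise(clusters):
    """Sum over clusters of pairwise squared deviations over 2|S_i|."""
    total = 0.0
    for pts in clusters:
        # All ordered pairs (x, y), including x == y, which contribute zero.
        pair_sum = sum((x - y) ** 2 for x, y in product(pts, repeat=2))
        total += pair_sum / (2 * len(pts))
    return total


def cross_term(pts):
    """Right-hand side of the identity for one cluster of scalars."""
    n = len(pts)
    mu = sum(pts) / n
    return sum(
        (pts[a] - mu) * (mu - pts[b])
        for a in range(n)
        for b in range(n)
        if a != b
    )


# Hypothetical partition of cocoon weights in grams.
clusters = [[1.0, 1.1, 1.2], [1.5, 1.6, 1.7]]
centroid_form = wcss_centroid(clusters)
pairwise_form = wcss_pairwise(clusters)
identity_rhs = sum(cross_term(pts) for pts in clusters)
```

All three quantities agree up to floating-point rounding, which is consistent with the equivalence of the two objectives stated above.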

Because the total variance is constant, this is also equivalent to maximizing the sum of squared 
deviations between points in different clusters (between-cluster sum of squares, BCSS), which 
follows easily from the law of total variance. 
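The decomposition of the total sum of squares can also be verified numerically. The sketch below assumes the usual definition of BCSS as the sum over clusters of the cluster size times the squared distance between the cluster mean and the overall mean. The data and function names are hypothetical.

```python
def mean_of(points):
    return sum(points) / len(points)


def total_ss(clusters):
    """Total sum of squares about the overall mean (partition-independent)."""
    points = [x for pts in clusters for x in pts]
    m = mean_of(points)
    return sum((x - m) ** 2 for x in points)


def within_ss(clusters):
    """Within-cluster sum of squares (WCSS)."""
    return sum(
        sum((x - mean_of(pts)) ** 2 for x in pts) for pts in clusters
    )


def between_ss(clusters):
    """Between-cluster sum of squares (BCSS), size-weighted."""
    points = [x for pts in clusters for x in pts]
    m = mean_of(points)
    return sum(len(pts) * (mean_of(pts) - m) ** 2 for pts in clusters)


# Two hypothetical partitions of the same six weights (grams).
good = [[1.0, 1.1, 1.2], [1.5, 1.6, 1.7]]
poor = [[1.0, 1.1], [1.2, 1.5, 1.6, 1.7]]

tss = total_ss(good)
good_parts = (within_ss(good), between_ss(good))
poor_parts = (within_ss(poor), between_ss(poor))
```

Since the total is the same for both partitions, the partition with the smaller WCSS necessarily has the larger BCSS.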


5. Conclusions 

We conclude that the proposed system can segregate silk moths by sex at the cocoon stage with
acceptable accuracy and speed compared with traditional methods. The statistical approach of using
the average weights of the training samples of the two sexes as the threshold is highly reliable
across varieties of cocoon seeds. The high-precision sensors and microcontrollers are fast enough
to run the algorithm, and the system reached a segregation rate and accuracy of about 95%.


6. Future scope 

The proposed sorting method can be used to segregate any lightweight objects and large numbers of
samples, and its accuracy can be improved using artificial intelligence and machine learning
algorithms.